Scaling Up Decentralized MDPs Through Heuristic Search
Internal identifier: 001997 (Main/Exploration); previous: 001996; next: 001998
Authors: Jilles Steeve Dibangoye [France]; Christopher Amato [United States]; Arnaud Doniec [France]
Abstract
Decentralized partially observable Markov decision processes (Dec-POMDPs) are rich models for cooperative decision-making under uncertainty, but are often intractable to solve optimally (NEXP-complete). The transition and observation independent Dec-MDP is a general subclass that has been shown to have complexity in NP, but optimal algorithms for this subclass are still inefficient in practice. In this paper, we first provide an updated proof that an optimal policy does not depend on the histories of the agents, but only on the local observations. We then present a new algorithm based on heuristic search that is able to expand search nodes by using constraint optimization. We show experimental results comparing our approach with the state-of-the-art Dec-MDP and Dec-POMDP solvers. These results show a reduction in computation time and an increase in scalability by multiple orders of magnitude in a number of benchmarks.
Links toward previous steps (curation, corpus...)
- to stream Hal, to step Corpus: 004362
- to stream Hal, to step Curation: 004362
- to stream Hal, to step Checkpoint: 001613
- to stream Main, to step Merge: 001A26
- to stream Main, to step Curation: 001997
The document in XML format
<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en">Scaling Up Decentralized MDPs Through Heuristic Search</title>
<author><name sortKey="Dibangoye, Jilles Steeve" sort="Dibangoye, Jilles Steeve" uniqKey="Dibangoye J" first="Jilles Steeve" last="Dibangoye">Jilles Steeve Dibangoye</name>
<affiliation wicri:level="1"><hal:affiliation type="researchteam" xml:id="struct-205124" status="OLD"><idno type="RNSR">200218290B</idno>
<orgName>Autonomous intelligent machine</orgName>
<orgName type="acronym">MAIA</orgName>
<date type="end">2014-12-31</date>
<desc><address><country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/equipes/maia</ref>
</desc>
<listRelation><relation active="#struct-129671" type="direct"></relation>
<relation active="#struct-300009" type="indirect"></relation>
<relation active="#struct-423090" type="direct"></relation>
<relation active="#struct-206040" type="indirect"></relation>
<relation active="#struct-413289" type="indirect"></relation>
<relation name="UMR7503" active="#struct-441569" type="indirect"></relation>
</listRelation>
<tutelles><tutelle active="#struct-129671" type="direct"><org type="laboratory" xml:id="struct-129671" status="VALID"><idno type="RNSR">198618246Y</idno>
<orgName>INRIA Nancy - Grand Est</orgName>
<desc><address><addrLine>615 rue du Jardin Botanique 54600 Villers-lès-Nancy</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/nancy</ref>
</desc>
<listRelation><relation active="#struct-300009" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-300009" type="indirect"><org type="institution" xml:id="struct-300009" status="VALID"><orgName>Institut National de Recherche en Informatique et en Automatique</orgName>
<orgName type="acronym">Inria</orgName>
<desc><address><addrLine>Domaine de Voluceau Rocquencourt - BP 105 78153 Le Chesnay Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/en/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-423090" type="direct"><org type="department" xml:id="struct-423090" status="VALID"><orgName>Department of Complex Systems, Artificial Intelligence &amp; Robotics</orgName>
<orgName type="acronym">LORIA - AIS</orgName>
<desc><address><country key="FR"></country>
</address>
<ref type="url">http://www.loria.fr/la-recherche-en/departements/complex-system-and-artificial-intelligence</ref>
</desc>
<listRelation><relation active="#struct-206040" type="direct"></relation>
<relation active="#struct-300009" type="indirect"></relation>
<relation active="#struct-413289" type="indirect"></relation>
<relation name="UMR7503" active="#struct-441569" type="indirect"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-206040" type="indirect"><org type="laboratory" xml:id="struct-206040" status="VALID"><idno type="IdRef">067077927</idno>
<idno type="RNSR">198912571S</idno>
<idno type="IdUnivLorraine">[UL]RSI--</idno>
<orgName>Laboratoire Lorrain de Recherche en Informatique et ses Applications</orgName>
<orgName type="acronym">LORIA</orgName>
<date type="start">2012-01-01</date>
<desc><address><addrLine>Campus Scientifique BP 239 54506 Vandoeuvre-lès-Nancy Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.loria.fr</ref>
</desc>
<listRelation><relation active="#struct-300009" type="direct"></relation>
<relation active="#struct-413289" type="direct"></relation>
<relation name="UMR7503" active="#struct-441569" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-413289" type="indirect"><org type="institution" xml:id="struct-413289" status="VALID"><idno type="IdRef">157040569</idno>
<idno type="IdUnivLorraine">[UL]100--</idno>
<orgName>Université de Lorraine</orgName>
<orgName type="acronym">UL</orgName>
<date type="start">2012-01-01</date>
<desc><address><addrLine>34 cours Léopold - CS 25233 - 54052 Nancy cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-lorraine.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle name="UMR7503" active="#struct-441569" type="indirect"><org type="institution" xml:id="struct-441569" status="VALID"><idno type="IdRef">02636817X</idno>
<idno type="ISNI">0000000122597504</idno>
<orgName>Centre National de la Recherche Scientifique</orgName>
<orgName type="acronym">CNRS</orgName>
<date type="start">1939-10-19</date>
<desc><address><country key="FR"></country>
</address>
<ref type="url">http://www.cnrs.fr/</ref>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
<placeName><settlement type="city">Nancy</settlement>
<settlement type="city">Metz</settlement>
<region type="region" nuts="2">Grand Est</region>
<region type="old region" nuts="2">Lorraine (région)</region>
</placeName>
<orgName type="university">Université de Lorraine</orgName>
</affiliation>
</author>
<author><name sortKey="Christopher, Amato" sort="Christopher, Amato" uniqKey="Christopher A" first="Amato" last="Christopher">Amato Christopher</name>
<affiliation wicri:level="1"><hal:affiliation type="laboratory" xml:id="struct-406632" status="VALID"><orgName>Computer Science and Artificial Intelligence Laboratory [Cambridge]</orgName>
<orgName type="acronym">CSAIL</orgName>
<desc><address><addrLine>The Stata Center, Building 32 32 Vassar Street Cambridge, MA 02139</addrLine>
<country key="US"></country>
</address>
<ref type="url">https://www.csail.mit.edu/</ref>
</desc>
<listRelation><relation active="#struct-22441" type="direct"></relation>
</listRelation>
<tutelles><tutelle active="#struct-22441" type="direct"><org type="institution" xml:id="struct-22441" status="VALID"><orgName>Massachusetts Institute of Technology [Cambridge]</orgName>
<orgName type="acronym">MIT</orgName>
<desc><address><addrLine>Massachusetts Avenue Cambridge, MA 02142</addrLine>
<country key="US"></country>
</address>
<ref type="url">http://mit.edu/</ref>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>États-Unis</country>
</affiliation>
</author>
<author><name sortKey="Arnaud, Doniec" sort="Arnaud, Doniec" uniqKey="Arnaud D" first="Doniec" last="Arnaud">Doniec Arnaud</name>
<affiliation wicri:level="1"><hal:affiliation type="department" xml:id="struct-224096" status="VALID"><orgName>École des Mines de Douai</orgName>
<orgName type="acronym">Mines Douai EMD</orgName>
<desc><address><addrLine>941 rue Charles Bourseul - CS10838 - 59508 Douai Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www2.mines-douai.fr/</ref>
</desc>
<listRelation><relation active="#struct-302102" type="direct"></relation>
</listRelation>
<tutelles><tutelle active="#struct-302102" type="direct"><org type="institution" xml:id="struct-302102" status="VALID"><orgName>Institut Mines-Télécom</orgName>
<desc><address><addrLine>46 rue Barrault - 75634 Paris Cedex 13</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.mines-telecom.fr/</ref>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">HAL</idno>
<idno type="RBID">Hal:hal-00765221</idno>
<idno type="halId">hal-00765221</idno>
<idno type="halUri">https://hal.inria.fr/hal-00765221</idno>
<idno type="url">https://hal.inria.fr/hal-00765221</idno>
<date when="2012-08-15">2012-08-15</date>
<idno type="wicri:Area/Hal/Corpus">004362</idno>
<idno type="wicri:Area/Hal/Curation">004362</idno>
<idno type="wicri:Area/Hal/Checkpoint">001613</idno>
<idno type="wicri:explorRef" wicri:stream="Hal" wicri:step="Checkpoint">001613</idno>
<idno type="wicri:Area/Main/Merge">001A26</idno>
<idno type="wicri:Area/Main/Curation">001997</idno>
<idno type="wicri:Area/Main/Exploration">001997</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en">Scaling Up Decentralized MDPs Through Heuristic Search</title>
<author><name sortKey="Dibangoye, Jilles Steeve" sort="Dibangoye, Jilles Steeve" uniqKey="Dibangoye J" first="Jilles Steeve" last="Dibangoye">Jilles Steeve Dibangoye</name>
<affiliation wicri:level="1"><hal:affiliation type="researchteam" xml:id="struct-205124" status="OLD"><idno type="RNSR">200218290B</idno>
<orgName>Autonomous intelligent machine</orgName>
<orgName type="acronym">MAIA</orgName>
<date type="end">2014-12-31</date>
<desc><address><country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/equipes/maia</ref>
</desc>
<listRelation><relation active="#struct-129671" type="direct"></relation>
<relation active="#struct-300009" type="indirect"></relation>
<relation active="#struct-423090" type="direct"></relation>
<relation active="#struct-206040" type="indirect"></relation>
<relation active="#struct-413289" type="indirect"></relation>
<relation name="UMR7503" active="#struct-441569" type="indirect"></relation>
</listRelation>
<tutelles><tutelle active="#struct-129671" type="direct"><org type="laboratory" xml:id="struct-129671" status="VALID"><idno type="RNSR">198618246Y</idno>
<orgName>INRIA Nancy - Grand Est</orgName>
<desc><address><addrLine>615 rue du Jardin Botanique 54600 Villers-lès-Nancy</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/nancy</ref>
</desc>
<listRelation><relation active="#struct-300009" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-300009" type="indirect"><org type="institution" xml:id="struct-300009" status="VALID"><orgName>Institut National de Recherche en Informatique et en Automatique</orgName>
<orgName type="acronym">Inria</orgName>
<desc><address><addrLine>Domaine de Voluceau Rocquencourt - BP 105 78153 Le Chesnay Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/en/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-423090" type="direct"><org type="department" xml:id="struct-423090" status="VALID"><orgName>Department of Complex Systems, Artificial Intelligence &amp; Robotics</orgName>
<orgName type="acronym">LORIA - AIS</orgName>
<desc><address><country key="FR"></country>
</address>
<ref type="url">http://www.loria.fr/la-recherche-en/departements/complex-system-and-artificial-intelligence</ref>
</desc>
<listRelation><relation active="#struct-206040" type="direct"></relation>
<relation active="#struct-300009" type="indirect"></relation>
<relation active="#struct-413289" type="indirect"></relation>
<relation name="UMR7503" active="#struct-441569" type="indirect"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-206040" type="indirect"><org type="laboratory" xml:id="struct-206040" status="VALID"><idno type="IdRef">067077927</idno>
<idno type="RNSR">198912571S</idno>
<idno type="IdUnivLorraine">[UL]RSI--</idno>
<orgName>Laboratoire Lorrain de Recherche en Informatique et ses Applications</orgName>
<orgName type="acronym">LORIA</orgName>
<date type="start">2012-01-01</date>
<desc><address><addrLine>Campus Scientifique BP 239 54506 Vandoeuvre-lès-Nancy Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.loria.fr</ref>
</desc>
<listRelation><relation active="#struct-300009" type="direct"></relation>
<relation active="#struct-413289" type="direct"></relation>
<relation name="UMR7503" active="#struct-441569" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-413289" type="indirect"><org type="institution" xml:id="struct-413289" status="VALID"><idno type="IdRef">157040569</idno>
<idno type="IdUnivLorraine">[UL]100--</idno>
<orgName>Université de Lorraine</orgName>
<orgName type="acronym">UL</orgName>
<date type="start">2012-01-01</date>
<desc><address><addrLine>34 cours Léopold - CS 25233 - 54052 Nancy cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-lorraine.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle name="UMR7503" active="#struct-441569" type="indirect"><org type="institution" xml:id="struct-441569" status="VALID"><idno type="IdRef">02636817X</idno>
<idno type="ISNI">0000000122597504</idno>
<orgName>Centre National de la Recherche Scientifique</orgName>
<orgName type="acronym">CNRS</orgName>
<date type="start">1939-10-19</date>
<desc><address><country key="FR"></country>
</address>
<ref type="url">http://www.cnrs.fr/</ref>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
<placeName><settlement type="city">Nancy</settlement>
<settlement type="city">Metz</settlement>
<region type="region" nuts="2">Grand Est</region>
<region type="old region" nuts="2">Lorraine (région)</region>
</placeName>
<orgName type="university">Université de Lorraine</orgName>
</affiliation>
</author>
<author><name sortKey="Christopher, Amato" sort="Christopher, Amato" uniqKey="Christopher A" first="Amato" last="Christopher">Amato Christopher</name>
<affiliation wicri:level="1"><hal:affiliation type="laboratory" xml:id="struct-406632" status="VALID"><orgName>Computer Science and Artificial Intelligence Laboratory [Cambridge]</orgName>
<orgName type="acronym">CSAIL</orgName>
<desc><address><addrLine>The Stata Center, Building 32 32 Vassar Street Cambridge, MA 02139</addrLine>
<country key="US"></country>
</address>
<ref type="url">https://www.csail.mit.edu/</ref>
</desc>
<listRelation><relation active="#struct-22441" type="direct"></relation>
</listRelation>
<tutelles><tutelle active="#struct-22441" type="direct"><org type="institution" xml:id="struct-22441" status="VALID"><orgName>Massachusetts Institute of Technology [Cambridge]</orgName>
<orgName type="acronym">MIT</orgName>
<desc><address><addrLine>Massachusetts Avenue Cambridge, MA 02142</addrLine>
<country key="US"></country>
</address>
<ref type="url">http://mit.edu/</ref>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>États-Unis</country>
</affiliation>
</author>
<author><name sortKey="Arnaud, Doniec" sort="Arnaud, Doniec" uniqKey="Arnaud D" first="Doniec" last="Arnaud">Doniec Arnaud</name>
<affiliation wicri:level="1"><hal:affiliation type="department" xml:id="struct-224096" status="VALID"><orgName>École des Mines de Douai</orgName>
<orgName type="acronym">Mines Douai EMD</orgName>
<desc><address><addrLine>941 rue Charles Bourseul - CS10838 - 59508 Douai Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www2.mines-douai.fr/</ref>
</desc>
<listRelation><relation active="#struct-302102" type="direct"></relation>
</listRelation>
<tutelles><tutelle active="#struct-302102" type="direct"><org type="institution" xml:id="struct-302102" status="VALID"><orgName>Institut Mines-Télécom</orgName>
<desc><address><addrLine>46 rue Barrault - 75634 Paris Cedex 13</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.mines-telecom.fr/</ref>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
</affiliation>
</author>
</analytic>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc><textClass></textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Decentralized partially observable Markov decision processes (Dec-POMDPs) are rich models for cooperative decision-making under uncertainty, but are often intractable to solve optimally (NEXP-complete). The transition and observation independent Dec-MDP is a general subclass that has been shown to have complexity in NP, but optimal algorithms for this subclass are still inefficient in practice. In this paper, we first provide an updated proof that an optimal policy does not depend on the histories of the agents, but only on the local observations. We then present a new algorithm based on heuristic search that is able to expand search nodes by using constraint optimization. We show experimental results comparing our approach with the state-of-the-art Dec-MDP and Dec-POMDP solvers. These results show a reduction in computation time and an increase in scalability by multiple orders of magnitude in a number of benchmarks.</div>
</front>
</TEI>
<affiliations><list><country><li>France</li>
<li>États-Unis</li>
</country>
<region><li>Grand Est</li>
<li>Lorraine (région)</li>
</region>
<settlement><li>Metz</li>
<li>Nancy</li>
</settlement>
<orgName><li>Université de Lorraine</li>
</orgName>
</list>
<tree><country name="France"><region name="Grand Est"><name sortKey="Dibangoye, Jilles Steeve" sort="Dibangoye, Jilles Steeve" uniqKey="Dibangoye J" first="Jilles Steeve" last="Dibangoye">Jilles Steeve Dibangoye</name>
</region>
<name sortKey="Arnaud, Doniec" sort="Arnaud, Doniec" uniqKey="Arnaud D" first="Doniec" last="Arnaud">Doniec Arnaud</name>
</country>
<country name="États-Unis"><noRegion><name sortKey="Christopher, Amato" sort="Christopher, Amato" uniqKey="Christopher A" first="Amato" last="Christopher">Amato Christopher</name>
</noRegion>
</country>
</tree>
</affiliations>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Wicri/Lorraine/explor/InforLorV4/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001997 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 001997 | SxmlIndent | more
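Once a record has been printed this way, its fields can be read with any XML library. The following is a minimal sketch in Python, not part of Dilib: the `summarize` helper and the `RECORD_EXCERPT` string are hypothetical, and the excerpt is a hand-reduced copy of the record above that keeps only elements without the `wicri:` and `hal:` prefixes, since a strict parser would likely reject those prefixes unless their namespaces are declared.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-reduced excerpt of the record shown above.
# Only un-prefixed elements are kept (assumption: the wicri:/hal: prefixes
# are not declared in the raw output and would stop a strict XML parser).
RECORD_EXCERPT = """
<record><TEI><teiHeader><fileDesc><titleStmt>
<title>Scaling Up Decentralized MDPs Through Heuristic Search</title>
<author><name sortKey="Dibangoye, Jilles Steeve">Jilles Steeve Dibangoye</name></author>
<author><name sortKey="Christopher, Amato">Amato Christopher</name></author>
<author><name sortKey="Arnaud, Doniec">Doniec Arnaud</name></author>
</titleStmt></fileDesc></teiHeader></TEI></record>
"""


def summarize(record_xml: str) -> dict:
    """Return the title and the author display names found in titleStmt."""
    root = ET.fromstring(record_xml)
    title_stmt = root.find("./TEI/teiHeader/fileDesc/titleStmt")
    return {
        "title": title_stmt.findtext("title"),
        # Names are returned exactly as stored in the record.
        "authors": [name.text for name in title_stmt.findall("author/name")],
    }


summary = summarize(RECORD_EXCERPT)
print(summary["title"])
for author in summary["authors"]:
    print(" -", author)
```

The names are returned exactly as stored in the record, so any reordering of first and last names would have to be handled separately.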
To add a link to this page within the Wicri network
{{Explor lien |wiki= Wicri/Lorraine |area= InforLorV4 |flux= Main |étape= Exploration |type= RBID |clé= Hal:hal-00765221 |texte= Scaling Up Decentralized MDPs Through Heuristic Search }}
This area was generated with Dilib version V0.6.33.